Skip to content

make: enable SECONDEXPANSION for module/application builds#12708

Merged
aabadie merged 1 commit intoRIOT-OS:masterfrom
kaspar030:make_enable_second_expansion
Nov 18, 2019
Merged

make: enable SECONDEXPANSION for module/application builds#12708
aabadie merged 1 commit intoRIOT-OS:masterfrom
kaspar030:make_enable_second_expansion

Conversation

@kaspar030
Copy link
Contributor

Contribution description

This PR enables GNU make SECONDEXPANSION for all module/application builds.

See the GNU make for more information.

This will make support for module source files in subfolders easy to implement.
See e.g., #11870.

This has been split out of #11870 in order to isolate any possible side effects.

Testing procedure

All build outputs should be binary identical.

Issues/PRs references

Split out of #11870.

@kaspar030 kaspar030 added Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation Area: build system Area: Build system CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Nov 14, 2019
@kaspar030 kaspar030 requested a review from aabadie November 14, 2019 11:24
@kaspar030
Copy link
Contributor Author

I tried to find a performance difference. I've used gnrc_networking as it has a high number of recursive make calls.

I'm using hyperfine for measurements, set to five warmup runs (-w5) and 50 test runs (-m 50). The test laptop has an Intel Core i5-8250U, is running Arch Linux, with GNU make 4.2.1.

This is with secondexpansion disabled:

[kaspar@ng riot/examples/gnrc_networking (make_enable_second_expansion)]$ hyperfine -w5 -m 50 "BOARD=samr21-xpro make all -j8"
Benchmark #1: BOARD=samr21-xpro make all -j8
  Time (mean ± σ):      1.151 s ±  0.006 s    [User: 1.587 s, System: 0.302 s]
  Range (min … max):    1.143 s …  1.183 s    50 runs

This is with secondexpansion enabled:

   [kaspar@ng riot/examples/gnrc_networking (make_enable_second_expansion)]$ hyperfine -w5 -m 50 "BOARD=samr21-xpro make all -j8"     
Benchmark #1: BOARD=samr21-xpro make all -j8                                                                                       
  Time (mean ± σ):      1.150 s ±  0.006 s    [User: 1.587 s, System: 0.305 s]
  Range (min … max):    1.142 s …  1.175 s    50 runs

TL;DR: no performance difference with current make version.

@miri64
Copy link
Member

miri64 commented Nov 14, 2019

To confirm your results I used the time util:

With .SECONDEXPANSION

time ( for i in $(seq 50); do; BOARD=samr21-xpro make -C examples/gnrc_networking clean all -j16; done )
real	1m12,623s
user	3m13,340s
sys	0m36,220s

Without .SECONDEXPANSION

time ( for i in $(seq 50); do; BOARD=samr21-xpro make -C examples/gnrc_networking clean all -j16; done )
real	1m0,045s
user	1m46,870s
sys	0m30,095s

In both cases I did not use ccache. Seems to me, that there is a performance trade-off with .SECONDEXPANSION.

@miri64
Copy link
Member

miri64 commented Nov 14, 2019

(done on an Ubuntu 19.10)

@kaspar030
Copy link
Contributor Author

(done on an Ubuntu 19.10)

@miri64 Are those results repeatable?

Using your for loop (with fixed "do;"), I still get the same results with/without SECONDEXPANSION.
Which make version is ubuntu 19.10 using?

@miri64
Copy link
Member

miri64 commented Nov 14, 2019

GNU Make 4.2.1. I'm trying to repeat the tests on a machine that isn't used during the test (an Arch Linux in this case however but also with make 4.2.1).

@miri64
Copy link
Member

miri64 commented Nov 14, 2019

(the Ubuntu machine is my main work machine and a lot of other things happened during the tests)

@miri64
Copy link
Member

miri64 commented Nov 14, 2019

Here are the results, I ran the experiment two times for every configuration this time, just to make sure I don't include any FS caching effects (so the reports below are from the second run):

time ( for i in $(seq 50); do BOARD=samr21-xpro make -C examples/gnrc_networking -j8 clean all; done ); time (for i in $(seq 50); do BOARD=samr21-xpro make -C examples/gnrc_networking -j8 clean all; done)

With .SECONDEXPANSION:

real	6m23.168s
user	38m8.172s
sys	4m43.219s

without:

real	6m23.352s
user	38m6.300s
sys	4m41.089s

The tests this time were run with 8 threads on an Intel i7-8550U (4 physical cores, 8 hyper threads), and not much else was running on the machine (nearly all services killed by hand before running the tests):

[mlenders@sofia RIOT]<3 sudo systemctl status
[sudo] password for mlenders: 
● sofia
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Thu 2019-11-14 15:22:04 CET; 6min ago
   CGroup: /
           ├─user.slice
           │ └─user-1000.slice
           │   ├─user@1000.service
           │   │ └─init.scope
           │   │   ├─877 /usr/lib/systemd/systemd --user
           │   │   └─878 (sd-pam)
           │   └─session-2.scope
           │     ├─ 866 login -- mlenders
           │     ├─ 884 -bash
           │     ├─1780 script -fa log-pr12708.txt
           │     ├─1781 bash -i
           │     ├─1886 sudo systemctl status
           │     ├─1887 systemctl status
           │     └─1888 less
           ├─init.scope
           │ └─1 /sbin/init
           └─system.slice
             ├─systemd-journald.service
             │ └─468 /usr/lib/systemd/systemd-journald
             └─lvm2-lvmetad.service
               └─475 /usr/bin/lvmetad -f

Find the detailed process on my nextcloud (link is valid until the end of this year).

@kaspar030
Copy link
Contributor Author

Thanks @miri64 for repeating the benchmarks!

@miri64
Copy link
Member

miri64 commented Nov 14, 2019

My Ubuntu isn't that much different from Arch when it comes to compiling, I noticed (make versions are the same, GCC is just one version behind [native GCC even one point release ahead!]) are the same. That's why I didn't repeat them on the Ubuntu machine with the same process.

@emmanuelsearch
Copy link
Member

@miri64 so what is your conclusion here, based on your tests? Is there some impact? Or not, as claimed by @kaspar030 ?

@miri64
Copy link
Member

miri64 commented Nov 15, 2019

There seems to be a (very small) speed trade-off that is negligible IMHO.

BTW I figured out that for my Ubuntu tests I apparently had ccache activated without my knowledge after all. That explains the insane speeds I got there.

@aabadie
Copy link
Contributor

aabadie commented Nov 15, 2019

That explains the insane speeds I got there.

Do you have an idea why ccache is introducing such a difference ? If ccache has an impact, I guess performance on Murdock could decrease, since it uses ccache to speed-up the builds.

@aabadie
Copy link
Contributor

aabadie commented Nov 15, 2019

Why only enabling this feature for modules/application and not also for all other targets prerequisites (from Makefile.include) ?
Maybe some prerequisites there also could benefit from this feature.

@miri64
Copy link
Member

miri64 commented Nov 15, 2019

I don't think ccache has a high impact... I think because ccache was activated, the execution was much quicker, so anything that happened unrelated to compilation during compilation was factored in much higher into the run-times.

@miri64
Copy link
Member

miri64 commented Nov 15, 2019

BTW I figured out that for my Ubuntu tests I apparently had ccache activated without my knowledge after all. That explains the insane speeds I got there.

I wanted to say with that: Please ignore my measurements from #12708 (comment). They may not be reliable ;-).

@kaspar030
Copy link
Contributor Author

Why only enabling this feature for modules/application and not also for all other targets prerequisites (from Makefile.include) ?
Maybe some prerequisites there also could benefit from this feature.

Honestly, because I need it only for modules/apps.

The flag would have to be set in bot Makefile.include and Makefile.base, as the former calls the latter recursively.

I suggest adding it to Makefile.include as soon as there's a user at least in sight?

Copy link
Contributor

@aabadie aabadie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @miri64 for the bench results. So apparently there's no impact on performance and this feature will be useful in #11870.

As suggested by @kaspar030, we can discuss later if it also be used in Makefile.include.

ACK and go.

@aabadie aabadie merged commit 5a70576 into RIOT-OS:master Nov 18, 2019
@kaspar030 kaspar030 deleted the make_enable_second_expansion branch November 18, 2019 15:33
@kaspar030
Copy link
Contributor Author

Thanks @miri64 @aabadie!

@fjmolinas fjmolinas added this to the Release 2020.01 milestone Dec 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: build system Area: Build system CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Type: enhancement The issue suggests enhanceable parts / The PR enhances parts of the codebase / documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants